153 research outputs found
Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
Use of socially generated "big data" to access information about collective
states of the minds in human societies has become a new paradigm in the
emerging field of computational social science. A natural application of this
would be the prediction of the society's reaction to a new product in the sense
of popularity and adoption rate. However, bridging the gap between "real time
monitoring" and "early predicting" remains a big challenge. Here we report on
an endeavor to build a minimalistic predictive model for the financial success
of movies based on collective activity data of online users. We show that the
popularity of a movie can be predicted well before its release by measuring and
analyzing the activity level of editors and viewers of the movie's corresponding
entry in Wikipedia, the well-known online encyclopedia.
Comment: 13 pages, including Supporting Information, 7 figures. Download the
dataset from: http://wwm.phy.bme.hu/SupplementaryDataS1.zi
Emergence of world-stock-market network
In the age of globalization, it is natural that the stock market of each
country is not independent from the other markets. In this case, collective
behavior could emerge from their mutual dependency. This article studies
the collective behavior of a set of forty influential markets in the world
economy with the aim of exploring a global financial structure that could be
called world-stock-market network. Towards this end, we analyze the
cross-correlation matrix of the indices of these forty markets using Random
Matrix Theory (RMT). We find the degree of collective behavior among the
markets and the share of each market in their structural formation. This
finding, together with the results obtained from the same calculation on four
stock markets, reinforces the idea of a world financial market. Finally, we draw
the dendrogram of the cross-correlation matrix to make communities in this
abstract global market visible. The dendrogram, drawn at a correlation level of
at least thirty percent, shows that the world financial market comprises three
communities, each of which includes stock markets in geographical proximity.
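The Random Matrix Theory comparison described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the synthetic return series, the common-factor weight of 0.5, and all function names are hypothetical. The idea is that the largest eigenvalue of the cross-correlation matrix lies far above the Marchenko-Pastur upper edge, which is what uncorrelated series would produce, when the series share a collective mode.

```python
import math
import random

def correlation_matrix(series):
    """Pearson correlation matrix of equally long series (list of lists)."""
    n_obs = len(series[0])
    standardized = []
    for s in series:
        mean = sum(s) / n_obs
        std = math.sqrt(sum((x - mean) ** 2 for x in s) / n_obs)
        standardized.append([(x - mean) / std for x in s])
    n = len(series)
    return [[sum(a * b for a, b in zip(standardized[i], standardized[j])) / n_obs
             for j in range(n)] for i in range(n)]

def largest_eigenvalue(matrix, iterations=200):
    """Power iteration; adequate for a symmetric positive semi-definite matrix."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    eig = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        eig = math.sqrt(sum(x * x for x in w))
        v = [x / eig for x in w]
    return eig

def marchenko_pastur_upper(n_series, n_obs):
    """Upper edge of the eigenvalue spectrum expected for uncorrelated series."""
    return (1 + math.sqrt(n_series / n_obs)) ** 2

# Hypothetical data: 20 synthetic "markets" sharing one common factor.
random.seed(0)
n_series, n_obs = 20, 500
common = [random.gauss(0, 1) for _ in range(n_obs)]
series = [[0.5 * c + random.gauss(0, 1) for c in common] for _ in range(n_series)]

corr = correlation_matrix(series)
lam_max = largest_eigenvalue(corr)
mp_upper = marchenko_pastur_upper(n_series, n_obs)
print(f"largest eigenvalue: {lam_max:.2f}, random-matrix upper edge: {mp_upper:.2f}")
```

An eigenvalue well above the upper edge signals collective behavior; the components of the corresponding eigenvector would then indicate each series' share in that mode.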
Mapping the UK Webspace: Fifteen Years of British Universities on the Web
This paper maps the national UK web presence on the basis of an analysis of
the .uk domain from 1996 to 2010. It reviews previous attempts to use web
archives to understand national web domains and describes the dataset. Next, it
presents an analysis of the .uk domain, including the overall number of links
in the archive and changes in the link density of different second-level
domains over time. We then explore changes over time within a particular
second-level domain, the academic subdomain .ac.uk, and compare linking
practices with variables, including institutional affiliation, league table
ranking, and geographic location. We do not detect institutional affiliation
affecting linking practices and find only partial evidence of league table
ranking affecting network centrality, but find a clear inverse relationship
between the density of links and the geographical distance between
universities. This echoes prior findings regarding offline academic activity,
which allows us to argue that real-world factors like geography continue to
shape academic relationships even in the Internet age. We conclude with
directions for future uses of web archive resources in this emerging area of
research.
Comment: To appear in the proceedings of WebSci 201
Editorial: At the Crossroads: Lessons and Challenges in Computational Social Science
The interest of physicists in economic and social questions is not new: during the last decades, we have witnessed the emergence of what are nowadays formally called sociophysics [1] and econophysics [2], which can be grouped under the common term “Interdisciplinary Physics” along with biophysics, medical physics, agrophysics, etc. With tools borrowed from statistical physics and complexity science, among others, these areas of study have already made important contributions to our understanding of how humans organize and interact in our modern society. Large-scale data analyses, agent-based modeling and numerical simulations, and finally mathematical modeling, have led to the discovery of new (universal) patterns and their quantitative description in socio-economic systems.
The Digital Flynn Effect: Complexity of Posts on Social Media Increases over Time
Parents and teachers often express concern about the extensive use of social
media by youngsters. Some of them see emoticons, undecipherable initialisms and
loose grammar typical for social media as evidence of language degradation. In
this paper, we use a simple measure of text complexity to investigate how the
complexity of public posts on a popular social networking site changes over
time. We analyze a unique dataset that contains texts posted by 942,336 users
from a large European city across nine years. We show that the chosen
complexity measure is correlated with the academic performance of users: users
from high-performing schools produce more complex texts than users from
low-performing schools. We also find that complexity of posts increases with
age. Finally, we demonstrate that overall language complexity of posts on the
social networking site is constantly increasing. We call this phenomenon the
digital Flynn effect. Our results may suggest that the worries about language
degradation are not warranted.
Circadian patterns of Wikipedia editorial activity: A demographic analysis
Wikipedia (WP) as a collaborative, dynamical system of humans is an
appropriate subject of social studies. Each single action of the members of
this society, i.e. editors, is well recorded and accessible. Using the
cumulative data of 34 Wikipedias in different languages, we try to characterize
and find the universalities and differences in temporal activity patterns of
editors. Based on this data, we estimate the geographical distribution of
editors for each WP across the globe. Furthermore, we clarify the differences
among different groups of WPs, which originate in the variance of cultural and
social features of the communities of editors.
Dynamics of conflicts in Wikipedia
In this work we study the dynamical features of editorial wars in Wikipedia
(WP). Based on our previously established algorithm, we build up samples of
controversial and peaceful articles and analyze the temporal characteristics of
the activity in these samples. On short time scales, we show that there is a
clear correspondence between conflict and burstiness of activity patterns, and
that memory effects play an important role in controversies. On long time
scales, we identify three distinct developmental patterns for the overall
behavior of the articles. We are able to distinguish cases eventually leading
to consensus from those cases where a compromise is far from achievable.
Finally, we analyze discussion networks and conclude that edit wars are mainly
fought by only a few editors.
Comment: Supporting information added
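The notion of burstiness mentioned in this abstract can be illustrated with a short sketch. The abstract does not say which measure the authors use; the coefficient below, B = (sigma - mu) / (sigma + mu) computed over inter-event times, is one commonly used choice and is an assumption here, as are the function name and the example timestamps.

```python
import statistics

def burstiness(timestamps):
    """Burstiness coefficient of inter-event times.

    Returns -1 for perfectly regular activity, values near 0 for
    Poisson-like activity, and values approaching 1 for highly bursty
    activity.
    """
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")
    mu = statistics.mean(gaps)
    sigma = statistics.pstdev(gaps)
    if mu + sigma == 0:
        raise ValueError("all timestamps are identical")
    return (sigma - mu) / (sigma + mu)

# Hypothetical edit times (arbitrary units): evenly spaced vs. clustered.
regular_edits = list(range(0, 100, 10))
clustered_edits = [0, 1, 2, 3, 100, 101, 102, 103, 200, 201, 202, 203]

print(burstiness(regular_edits))
print(burstiness(clustered_edits))
```

Evenly spaced edits give the minimum value, while edits that arrive in tight clusters separated by long pauses give a positive value.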
A practical approach to language complexity: a Wikipedia case study
In this paper we present statistical analysis of English texts from Wikipedia. We try to address the issue of language complexity empirically by comparing the simple English Wikipedia (Simple) to comparable samples of the main English Wikipedia (Main). Simple is supposed to use a more simplified language with a limited vocabulary, and editors are explicitly requested to follow this guideline, yet in practice the vocabulary richness of both samples is at the same level. Detailed analysis of longer units (n-grams of words and part of speech tags) shows that the language of Simple is less complex than that of Main primarily due to the use of shorter sentences, as opposed to drastically simplified syntax or vocabulary. Comparing the two language varieties by the Gunning readability index supports this conclusion. We also report on the topical dependence of language complexity, that is, that the language is more advanced in conceptual articles compared to person-based (biographical) and object-based articles. Finally, we investigate the relation between conflict and language complexity by analyzing the content of the talk pages associated with controversial and peacefully developing articles, concluding that controversy has the effect of reducing language complexity.
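The Gunning readability index mentioned in this abstract can be sketched as below. The formula is the standard fog index, 0.4 * (words per sentence + 100 * share of words with three or more syllables). The syllable counter is a crude vowel-group heuristic and the sample texts are hypothetical, so the numbers are only indicative and this is not the authors' implementation.

```python
import re

def count_syllables(word):
    """Approximate syllables as runs of vowels (heuristic, not exact)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def gunning_fog(text):
    """Gunning fog index of a non-empty English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))

# Hypothetical samples standing in for "Simple" and "Main" style text.
simple_text = "The cat sat. The dog ran."
main_text = "Computational analysis demonstrates considerable variability."

print(gunning_fog(simple_text))
print(gunning_fog(main_text))
```

Because the index combines sentence length with the share of long words, shortening sentences alone lowers it, which is in line with the abstract's observation about Simple.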
Human-machine networks: Towards a typology and profiling framework
© Springer International Publishing Switzerland 2016. In this paper we outline an initial typology and framework for the purpose of profiling human-machine networks, that is, collective structures where humans and machines interact to produce synergistic effects. Profiling a human-machine network along the dimensions of the typology is intended to facilitate access to relevant design knowledge and experience. In this way the profiling of an envisioned or existing human-machine network will both facilitate relevant design discussions and, more importantly, serve to identify the network type. We present experiences and results from two case trials: a crisis management system and a peer-to-peer reselling network. Based on the lessons learnt from the case trials we suggest potential benefits and challenges, and point out needed future work.